1. INTRODUCTION

Fruits are an essential component of human nutrition: the World Health Organization (WHO)
recommends 400 g of fruits and vegetables (FVs) a day [1]. In an ideal situation, the steps for
converting FVs harvested from the farm into either raw materials or finished products are cleaning,
sorting, processing, mixing, liquefaction and distillation, filling and capping, packaging, storage,
and preservation, so that they can serve as raw materials, as in [2]-[6], for the production of fruit
juice and syrup, canned fruit, and alcoholic beverages [7].

The exact steps for converting fruits into finished products or raw materials vary, but should
include one or more of those mentioned above. FVs often arrive at the production plant with
significant variation in quality and can be categorized as unripe, ripe, or rotten. This variation
can be attributed to the distance between the farm and the plant, logistic miscalculation, and the
capacity of the plant. It affects the fruit's yield, taste, nutrient content, and the profit margin of
the business. The conventional approach of manual inspection used by manufacturers to detect,
categorize, and sort FVs has proven unreliable and ineffective because of human fatigue and error
[8], [9]. Manufacturers therefore need a reliable mechanism that can detect, categorize, and sort
FVs by their condition on arrival to ensure the best product quality. The advent of Industry 4.0 has
created a hybrid approach to overcoming these and other emerging challenges in production
industries, making production lines smarter and more flexible and reducing production cost by
exploiting the internet of things (IoT), artificial intelligence, embedded software systems, cloud
computing, augmented reality, additive manufacturing, and big data and security [10]-[12].

Bhargava and Bansal [13] studied four types of fruit (orange, apple, banana, and avocado),
extracting geometric features of the fruits and using four classifiers, namely k-nearest neighbor
(k-NN), support vector machine (SVM), sparse representative classifier (SRC), and artificial neural
network (ANN), to build models that predict fruit quality. The authors reported that SVM performed
best, with an accuracy of 98.48% for fruit detection and 95.72% for fruit quality. Blasco et al. [14]
used Bayesian discriminant analysis (BDA) for segmentation, allowing fruits to be precisely
separated from the background and spots on the fruit to be detected with an accuracy of 86%.
Another study combined SVM, a feature extraction algorithm, and X-ray computed tomography to
determine the internal condition of pear fruits; the developed classifiers detected defects in
internal quality with an accuracy of 90-95% [15]. Munera et al. [16] combined hyperspectral imaging
(visible and near-infrared) with machine learning classifiers (random forest and extreme gradient
boosting, XGBoost) to identify both external and internal defects in 'Algerie' loquat fruit,
obtaining success rates of 97.5%, 96.7%, and 95.9%.

In addition, Saripa et al. [17] showed how the maturity of watermelon can be predicted using
ultraviolet near-infrared spectroscopy. The research demonstrated that internal quality parameters
such as soluble solids content (SSC), pH level, firmness, and moisture content can be used to
predict the maturity and quality of the watermelon. Using an SVM model, a maturity prediction
accuracy of 85% was achieved. Zhang et al. [18] explored an acoustic technique that combines
kernel principal component analysis (KPCA) with machine learning to classify watermelons as
unripe, ripe, or over-ripe. A set of fundamental electrical signals was obtained with a microphone
by tapping the watermelons (unripe, ripe, and over-ripe), and these signals were converted to
discrete signals. KPCA was used to extract features and patterns from the converted signals, and
three classifier models were built on the extracted features. When tested, the classifiers achieved
a prediction accuracy of 92%. Ding et al. [19] assessed firmness as a key parameter for determining
the internal quality of watermelon. The authors combined an acoustic vibration technique with
near-infrared (NIR) spectroscopy to determine firmness with a prediction accuracy of about 0.841.

Furthermore, Xu et al. [20] discussed the relationship between the density of a watermelon and
the detection of hollows inside the fruit. The study demonstrated that the volume of a watermelon
can be obtained with a Helmholtz resonance system, from which the density can be deduced. With an
overall prediction rate of 82.5%, Helmholtz resonance showed good prospects for measuring the
volume and internal quality of watermelon. Castro et al. [21] used machine learning techniques and
color space models to develop a classifier for Cape gooseberry fruits. According to the authors'
results, the classification of Cape gooseberry fruits by ripeness level is sensitive to both the
classification technique and the color space used. The study in [22] discussed how the sweetness of
watermelon can be classified. It proposed a quantitative approach in which a sensory test was used
to establish three sweetness classes: very sweet (°Brix > 10), sweet (7 ≤ °Brix < 10), and flat
(°Brix < 7). Machine learning techniques were then used to build a classifier that predicts the
sweetness of watermelon better than NIR spectroscopy. Mao et al. [23] developed an acoustic device
that measures the firmness of watermelon after studying the effect of the hitting ball and fruit
tray on the spectrum. A firmness index fm and acoustic parameters were recommended for correlating
with the firmness of watermelon. Finally, De-la-Torre et al. [24] explored the multi-feature
extraction ability of five algorithms on color space models to determine the ripeness of Cape
gooseberry fruits. The results show that selecting and combining color channels allows classifiers
to reach similar levels of accuracy.

This paper proposes a robust dual-classifier scheme, a microcontroller-based sensory system with
machine learning (MCSS-ML). The scheme combines an MCSS with ML, exploiting the flexibility of a
microcontroller (in this case an Arduino) to integrate an IoT-based system that detects and
classifies watermelons (ripe and unripe) using an IP-camera, a humidity/temperature sensor, and a
gas sensor. The you only look once (YOLO) technique is used to develop a classifier model that
identifies and classifies watermelons. A method for improving the training dataset is also
established, in which successfully identified watermelons are automatically added to the dataset.
Datasets obtained by the system during testing, grouped by fruit condition, are sent to the cloud,
and an option to download them from any web-enabled device is incorporated for easy reference. The
rest of this paper is structured as follows: section 2 describes the design flow and building blocks
of the proposed dual classifier; section 3 discusses the results obtained with the developed
dual-classifier on watermelons, presents the performance evaluation of the classifier at different
specifications and times with further analysis to validate the proposed system, and discusses the
cloud-based database application created to save the datasets; and finally, section 4 concludes the
paper.


2. METHODS 

MCSS-ML is a scheme built on the Arduino UNO microcontroller to receive, process, and send out
watermelon data in real time. The materials and methods used in analyzing each fruit sample are
described below. The scheme is robust because it integrates many features of Industry 4.0.


2.1. Watermelon fruit samples 

The samples were purchased from a major fruit market, Shasha market, Ado Ekiti, Nigeria,
located at [7°33'03.4"N 5°13'08.6"E]. Twenty watermelons at different degrees of ripeness were
used, and they were measured until they were rotten in order to find the threshold of ethylene gas
produced by watermelons. The level of ethylene emitted by fruits has been used to classify their
ripening stages; studies have shown that the ethylene content of fruits increases as they ripen
[25]-[28]. An experiment was therefore carried out for four weeks on the twenty watermelons in a
controlled environment to obtain the maximum threshold values of temperature, humidity, and
ethylene needed to determine whether a watermelon is ripe or rotten. These values are used as
benchmarks to classify a watermelon as ripe or rotten. Note that most of the watermelons were
physically rotten by the fourth week. The results of the experiment are summarized in Table 1.


Table 1. Results of ripening stages’ parameters of watermelons

Week   Average humidity   Average temperature   Average ethylene
       obtained (%)       obtained (°C)         emission (ppm)
1      65.54              33.29                 1.96
2      69.16              32.14                 4.086
3      69.53              30.86                 10.68
4      70.79              31.00                 33.00


The temperature and humidity values ranged from 30-33 °C and 65-71%, respectively, with a
minimum average ethylene emission of 1.96 ppm and a maximum of 33.00 ppm, averaged across the 20
watermelons over the month. Not all watermelons were fully rotten, but they began to deteriorate
faster once the ethylene range of 50-70 ppm was passed (this range is not listed in Table 1, which
reports weekly averages only). This shows that watermelons rot faster when exposed to a large
amount of ethylene from themselves or from other watermelons, and the threshold parameters were
identified accordingly.
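
The ranges quoted above can be recomputed directly from the weekly averages. The following is a minimal sketch that only restates the Table 1 values; the helper names `column_range` and `weekly_growth` are illustrative and not part of the original system.

```python
# Weekly averages copied from Table 1:
# (week, humidity %, temperature in degrees C, ethylene ppm).
TABLE_1 = [
    (1, 65.54, 33.29, 1.96),
    (2, 69.16, 32.14, 4.086),
    (3, 69.53, 30.86, 10.68),
    (4, 70.79, 31.00, 33.00),
]


def column_range(rows, index):
    """Return (min, max) of one column of the table."""
    values = [row[index] for row in rows]
    return min(values), max(values)


def weekly_growth(rows, index):
    """Return the ratio of each week's value to the previous week's value."""
    values = [row[index] for row in rows]
    return [later / earlier for earlier, later in zip(values, values[1:])]


humidity_range = column_range(TABLE_1, 1)
temperature_range = column_range(TABLE_1, 2)
ethylene_range = column_range(TABLE_1, 3)
ethylene_growth = weekly_growth(TABLE_1, 3)

print("humidity range (%):", humidity_range)
print("temperature range (C):", temperature_range)
print("ethylene range (ppm):", ethylene_range)
print("week-over-week ethylene ratio:", [round(g, 2) for g in ethylene_growth])
```

On these averages the ethylene emission grows by a factor of roughly 2 to 3 per week, while temperature and humidity stay within a narrow band.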


2.2. MCSS-ML 
2.2.1. Hardware unit
The hardware units of the system are as follows:

- Microcontroller (Arduino ATmega328P): Arduino is an open-source, high-performance, and highly
flexible microcontroller platform that provides analog and digital inputs and outputs (I/O ports),
SPI and serial interfaces, analog-to-digital converter channels, memory units, a timer, and a
system clock. The functionalities to be developed are programmed in the C++ language.

- IP-camera: An external HD camera of 1280x720 resolution, placed about 35 cm from the watermelon,
provides the input to the microcontroller.

- Temperature/humidity sensor (DHT11): The DHT11 is a temperature and humidity sensor that
generates an already calibrated digital signal. The sensor includes an 8-bit microprocessor for
serial output of temperature and humidity values. Because it is calibrated, it is easy to connect
to microcontrollers. With an accuracy of ±1 °C and ±1%, the sensor can measure temperature from
0 °C to 50 °C and humidity from 20% to 90% [29], [30].

- Gas sensor (MQ-3): This sensor is used for detecting ethanol, benzene, ethylene, and other gases.
The sensor is highly sensitive and can obtain readings almost instantaneously. The sensitive
material of the MQ-3 gas sensor is tin(IV) oxide (SnO2), which has lower conductivity in fresh air
[31]. The sensor is used to measure the ethylene gas emitted by the watermelons.

- WiFi module (ESP8266): The ESP8266 is a low-cost WiFi microchip produced by Espressif Systems
that allows the microcontroller to connect wirelessly to a network, with a full TCP/IP stack, using
Hayes-style commands [32]. The module has powerful onboard processing and storage capabilities
that allow it to be incorporated into the design of mobile devices, wearables, and IoT-based systems.

2.2.2. Software unit

Visual Studio Code (VSC) is the integrated development environment (IDE) used to develop the
Python model that detects and classifies watermelons using a trained YOLOv4-tiny algorithm. VSC
offers rich support for Python libraries, which were used in the development of the classifier.
Some of the important libraries used in the model are OpenCV, NumPy, Pandas, and the csv, os, and
json modules.

- Google Colab: Google Colab is a browser-based Python development environment that was used to
train the watermelon model. It is integrated with Google cloud, which provides a robust database
system for saving the dataset to be trained on, and it gives the user access to a free GPU.

- ThingSpeak: ThingSpeak is an open-source cloud-based database that allows datasets generated in
real time to be aggregated, visualized, and analyzed in the cloud via an API. Generated data can be
sent to ThingSpeak from devices and downloaded from the cloud when needed. User-defined alerts can
also be set up [33]. In this work, the counts of watermelons classified as either ripe or rotten are
saved to the cloud for easy access.
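
As an illustration of how a device or script can push values to ThingSpeak over its HTTP update endpoint, the sketch below only builds the request URL and does not contact the network. The write key placeholder and the field layout (field1 for the ripe count, field2 for the rotten count) are assumptions made for this example, not details given in the paper.

```python
from urllib.parse import urlencode

THINGSPEAK_UPDATE_URL = "https://api.thingspeak.com/update"


def build_update_url(write_api_key, fields):
    """Build a ThingSpeak channel-update URL from a {field_number: value} mapping."""
    params = {"api_key": write_api_key}
    for number, value in sorted(fields.items()):
        if not 1 <= number <= 8:
            raise ValueError("ThingSpeak channels have fields 1 to 8")
        params[f"field{number}"] = value
    return f"{THINGSPEAK_UPDATE_URL}?{urlencode(params)}"


# Hypothetical key and field layout: field1 = ripe count, field2 = rotten count.
url = build_update_url("YOUR_WRITE_API_KEY", {1: 12, 2: 3})
print(url)
```

In the described system the request is issued by the ESP8266 module; from Python the same URL could be requested with `urllib.request.urlopen(url)`.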


2.3. Network/system architecture 

YOLOv4-tiny, which stands for “You Only Look Once version 4-tiny”, is a lighter, compressed
version of the YOLOv4 deep learning algorithm. It is built on a convolutional neural network
backbone and is designed to increase the speed of real-time object detection on mobile devices and
embedded systems, as is applicable in this research work. YOLOv4-tiny uses the CSPDarknet53-tiny
network as its backbone network. The CSPDarknet53-tiny network uses CBLblock and CSPBlock for
feature extraction and uses the LeakyRelu function as its activation function for simple
computation [34]-[38]. The LeakyRelu function is given as:


\[
y_i =
\begin{cases}
x_i, & x_i \geq 0 \\
\dfrac{x_i}{a_i}, & x_i < 0
\end{cases}
\tag{1}
\]

where $a_i$ is a constant factor greater than 1.
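
A minimal Python sketch of this activation, assuming the common form in which non-negative inputs pass through unchanged and negative inputs are divided by the constant factor. The default value `a = 10.0` (a slope of 0.1 for negative inputs) is chosen for illustration only and is not stated in the paper.

```python
def leaky_relu(x, a=10.0):
    """Leaky ReLU in the x / a form: identity for x >= 0, x / a for x < 0.

    `a` must be greater than 1; the default of 10.0 is an illustrative
    assumption (it corresponds to a negative-side slope of 0.1).
    """
    if a <= 1:
        raise ValueError("a must be greater than 1")
    return x if x >= 0 else x / a


activations = [leaky_relu(v) for v in (-2.0, -0.5, 0.0, 3.0)]
print(activations)
```

Because negative inputs are scaled rather than zeroed, the gradient never vanishes completely on the negative side, and the operation costs only one comparison and one division.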

The CBLblock covers the convolution operation, batch normalization, and activation function while 
CSPBlock splits the input feature map into two parts and concatenates the two parts in the cross-stage 
residual edge. Figure 1 shows the network structure of YOLOv4-tiny. 

Figure 1. The system arrangement of YOLOv4-tiny


The YOLOv4-tiny model is typically pre-trained on ImageNet classification, so that the
network's weights are already adapted to detect relevant features in an image. To suit the aim of
this research, it was retrained on a watermelon dataset of 549 images and 549 text files containing
the bounding box coordinates of each watermelon in the corresponding image. The model consists of
21 convolutional layers, 2 maxpool layers, an output layer, and an input layer, each with its own
specification of batch normalization, filters, size, stride, pad, and activation. The whole
structure of the pre-trained model is given in the config file, and a label file holds the names of
all objects the model is expected to recognize.
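
The annotation text files can be read with a few lines of Python. The sketch below assumes the usual Darknet label layout, one line per object in the form `<class_id> <x_center> <y_center> <width> <height>` with box values normalized to the image size; the paper does not spell out the format, and the sample line is hypothetical.

```python
def parse_yolo_label(line, image_width, image_height):
    """Convert one Darknet-style label line to a class id and pixel box corners.

    Assumed line format: "<class_id> <x_center> <y_center> <width> <height>",
    with the four box values normalized to the range 0..1.
    """
    class_id, x_c, y_c, w, h = line.split()
    x_c, w = float(x_c) * image_width, float(w) * image_width
    y_c, h = float(y_c) * image_height, float(h) * image_height
    left, top = x_c - w / 2, y_c - h / 2
    return int(class_id), (round(left), round(top), round(left + w), round(top + h))


# Hypothetical annotation for a 1280x720 frame; class 0 stands for "watermelon".
label = parse_yolo_label("0 0.5 0.5 0.25 0.5", 1280, 720)
print(label)
```

The returned tuple holds the class id and the (left, top, right, bottom) corners in pixels, which is the form most drawing and cropping routines expect.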


2.4. Method 

Training: Training was done in Google Colab with the Darknet YOLOv4 training environment
installed. The watermelon dataset was linked appropriately and a custom YOLOv4 training config
file for Darknet was configured. The custom watermelon YOLOv4 object detector was trained for
2,000 iterations, with a run time of 2 hours and an average of 0.5 seconds per iteration. Figures 2
and 3 show that the model achieved a best accuracy (mAP@0.5) of 88.10% and an average loss of 0.30,
which changes as the weights of the custom model are reloaded. With a learning rate of 0.000026,
the model processed 96,000 images over these 2,000 iterations.
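
The image count follows from simple arithmetic, worked through in the sketch below. The number of passes over the dataset assumes that all 549 images were used for training, which the text does not state.

```python
ITERATIONS = 2_000
IMAGES_PROCESSED = 96_000
DATASET_SIZE = 549  # assumption: every image in the dataset is a training image

images_per_iteration = IMAGES_PROCESSED // ITERATIONS
dataset_passes = IMAGES_PROCESSED / DATASET_SIZE

# Cross-check against an intermediate log line (iteration 1996 reports 95808 images).
assert 1996 * images_per_iteration == 95_808

print(f"images per iteration: {images_per_iteration}")
print(f"approximate passes over the dataset: {dataset_passes:.1f}")
```

The 48 images per iteration match the running image counter in the training log, which advances by 48 from one iteration to the next.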


(next mAP calculation at 2000 iterations)
Last accuracy mAP@0.5 = 88.10 %, best = 88.10 %
1996: 0.273805, 0.294537 avg loss, 0.000026 rate, 0.540627 seconds, 95808 images, 0.015532 hours left
Loaded: 0.000042 seconds

(next mAP calculation at 2000 iterations)
Last accuracy mAP@0.5 = 88.10 %, best = 88.10 %
1997: 0.254799, 0.290564 avg loss, 0.000026 rate, 0.538806 seconds, 95856 images, 0.015382 hours left
Loaded: 0.000049 seconds

(next mAP calculation at 2000 iterations)
Last accuracy mAP@0.5 = 88.10 %, best = 88.10 %
1998: 0.269628, 0.288470 avg loss, 0.000026 rate, 0.513975 seconds, 95904 images, 0.015233 hours left
Loaded: 0.000052 seconds

(next mAP calculation at 2000 iterations)
Last accuracy mAP@0.5 = 88.10 %, best = 88.10 %
1999: 0.309421, 0.290565 avg loss, 0.000026 rate, 0.524612 seconds, 95952 images, 0.015084 hours left
Loaded: 0.000050 seconds

(next mAP calculation at 2000 iterations)
Last accuracy mAP@0.5 = 88.10 %, best = 88.10 %
2000: 0.392564, 0.300765 avg loss, 0.000026 rate, 0.511344 seconds, 96000 images, 0.014934 hours left

calculation mAP (mean average precision)...
108
detections_count = 543, unique_truth_count = 156
class_id = 0, name = watermelon, ap = 87.79% (TP = 132, FP = 47)

for conf_thresh = 0.25, precision = 0.74, recall = 0.85, F1-score = 0.79
for conf_thresh = 0.25, TP = 132, FP = 47, FN = 24, average IoU = 52.52 %

IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.877936, or 87.79 %
Total Detection Time: 0 Seconds


Figure 2. Watermelon custom model showing the level of accuracy 
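
The precision, recall, and F1-score reported in Figure 2 can be reproduced from the true positive, false positive, and false negative counts in the same log. The sketch below uses the standard definitions; the function name `detection_metrics` is illustrative.

```python
def detection_metrics(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts."""
    precision = tp / (tp + fp)   # share of detections that are correct
    recall = tp / (tp + fn)      # share of ground-truth objects that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the training log at a confidence threshold of 0.25.
precision, recall, f1 = detection_metrics(tp=132, fp=47, fn=24)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Note that TP + FN = 156, which equals the `unique_truth_count` printed in the log.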


The watermelon is placed 35 cm from the HD camera, and the model trained with YOLOv4-tiny
detects and classifies the object in front of the camera as a watermelon. When a positive
classification is made, the new image is saved to the training dataset to improve the level of
accuracy. The number of watermelons identified by the classifier is sent to ThingSpeak (the cloud
database) by the WiFi module, and this data can be downloaded in CSV format.

Furthermore, once the watermelon has been identified, the two sensors come into action to
determine whether the watermelon is ripe or rotten based on the thresholds set. This is achieved
with chained conditional statements that encode the conditions discussed in Table 1, which
determine whether the watermelon is ripe or rotten based on the ethylene emitted by the fruit. The
number of watermelons classified as either ripe or rotten is also sent to the cloud and is
downloadable in CSV format. The MCSS-ML scheme is represented in the block diagram shown in
Figure 4.

Figure 3. Graph showing the achieved accuracy and loss of training

Figure 4. Block diagram of MCSS-ML scheme for detecting/classifying watermelon
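
The sensor-based step described in Section 2.4 can be sketched as a short chain of conditionals. The paper does not list its exact cut-offs, so every constant below is an assumption: the ethylene limit reuses the week-4 average from Table 1, the temperature and humidity bands are widened slightly to cover the Table 1 values, and the extra "unknown" outcome for out-of-band readings is introduced here for illustration.

```python
from collections import Counter

# Illustrative thresholds only; none of these cut-offs is given in the paper.
ETHYLENE_ROTTEN_PPM = 33.0          # week-4 average ethylene from Table 1
TEMPERATURE_BAND_C = (30.0, 34.0)   # assumed controlled-environment band
HUMIDITY_BAND_PCT = (65.0, 71.0)    # assumed controlled-environment band


def classify_watermelon(ethylene_ppm, temperature_c, humidity_pct):
    """Return 'unknown', 'rotten', or 'ripe' from one set of sensor readings."""
    temperature_ok = TEMPERATURE_BAND_C[0] <= temperature_c <= TEMPERATURE_BAND_C[1]
    humidity_ok = HUMIDITY_BAND_PCT[0] <= humidity_pct <= HUMIDITY_BAND_PCT[1]
    if ethylene_ppm < 0 or not (temperature_ok and humidity_ok):
        return "unknown"   # readings outside the conditions the thresholds were set for
    elif ethylene_ppm >= ETHYLENE_ROTTEN_PPM:
        return "rotten"
    else:
        return "ripe"


# Readings in the order (ethylene ppm, temperature C, humidity %), from Table 1.
readings = [(1.96, 33.29, 65.54), (10.68, 30.86, 69.53), (33.00, 31.00, 70.79)]
counts = Counter(classify_watermelon(*reading) for reading in readings)
print(dict(counts))
```

The resulting counts per class are the kind of values the system sends to the cloud after each classification.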


3. RESULTS AND DISCUSSION 

MCSS-ML was tested in a controlled environment, since the values used to build the system were
obtained in a controlled environment. Figure 5 shows the system making a detection and displaying
the detected watermelon within a bounding box labeled with the name of the detected object. When
MCSS-ML was validated with watermelons, the accuracy reported for the object detected as a
watermelon was consistently between 0.9 and 1.0. Additionally, MCSS-ML determines whether the
detected watermelon is ripe or rotten based on the ethylene reading obtained from the sensors.
Figure 6 shows the systematic classification of watermelons based on the ethylene readings
obtained by the sensors on the microcontroller.


Figure 6. MCSS classifying ripe and rotten watermelons 


The dataset of the number of watermelons classified as either ripe or rotten was successfully
sent to the ThingSpeak database by the WiFi module of the system and is downloadable in CSV format
for easy reference. Figure 7 shows the template of the CSV file downloadable from ThingSpeak. The
real-time ethylene, temperature, and humidity readings of the environment are also taken and sent
to the cloud for monitoring. This makes the system robust and provides a means of viewing its
performance in real time. Figure 8 shows a real-time graphical display of the ethylene,
temperature, and humidity readings sent to ThingSpeak.

Figure 7. CSV template of the downloaded file from ThingSpeak

Figure 8. Real-time graphical readings of the fruits’ conditions on ThingSpeak.com
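
A downloaded CSV file can be processed locally with Python's standard library. The sketch below assumes the export uses `created_at`, `entry_id`, and `fieldN` columns; the sample rows and the mapping of field1, field2, and field3 to temperature, humidity, and ethylene are hypothetical and not taken from Figure 7.

```python
import csv
import io

# Hypothetical sample in the assumed export layout; the values are made up.
SAMPLE_CSV = """created_at,entry_id,field1,field2,field3
2022-01-10 09:00:00 UTC,1,33.3,65.5,2.0
2022-01-17 09:00:00 UTC,2,32.1,69.2,4.1
2022-01-24 09:00:00 UTC,3,30.9,69.5,10.7
"""

# Assumed meaning of each channel field.
FIELD_NAMES = {"field1": "temperature_c", "field2": "humidity_pct", "field3": "ethylene_ppm"}


def load_readings(csv_text):
    """Parse exported rows into dictionaries with named, numeric readings."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = {"created_at": record["created_at"], "entry_id": int(record["entry_id"])}
        for field, name in FIELD_NAMES.items():
            row[name] = float(record[field])
        rows.append(row)
    return rows


readings = load_readings(SAMPLE_CSV)
peak = max(readings, key=lambda row: row["ethylene_ppm"])
print(peak["created_at"], peak["ethylene_ppm"])
```

For a real download, the text of the file would be passed to `load_readings` in place of `SAMPLE_CSV`.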


4. CONCLUSION 

A robust MCSS dual-classifier that uses real-time images and sensor readings as inputs to
detect and classify watermelons has been designed and implemented. The system detected watermelons
with an accuracy of about 88.10% and a loss of 0.3. A classification model was also integrated into
the system to determine whether a watermelon is ripe or rotten, achieving a prediction accuracy of
about 85-90% on the tested data. Furthermore, a cloud-based database is used to save the data
obtained from the monitoring system, and the algorithm can retrieve data from it. The retrieved
data is then saved locally in a designated CSV file for easy access, documentation, and future
reference. The developed dual-classifier can be deployed on a conveyor in a manufacturing line, for
maximal performance together with a robotic hand that sorts the watermelons based on the
information received from MCSS. The development and integration of such a robotic hand, able to
sort watermelons accurately according to the classifier output (ripe/rotten), is the next phase for
the developed system.